Dynamical Fermions with Fat Links
We present and test a new method for simulating dynamical fermions with fat links. Our construction
is based on the introduction of auxiliary but dynamical gauge fields. It works with any
fermionic action and can be combined with any fermionic updating algorithm. In our simulation we use an
over-relaxation step, which makes the method effective. For four flavors of staggered fermions, first results
indicate that flavor symmetry at a lattice spacing $a \approx 0.2$ fm is restored to a few percent. With the
standard action this amount of flavor symmetry restoration is achieved at $a \approx 0.07$ fm. We estimate
that the overall computational cost is reduced by at least a factor of 10.
I. INTRODUCTION



Improved actions reduce lattice artifacts and thus allow simulations at coarser lattice spacing, larger lattice quark 
masses and smaller lattice volumes. The use of improved actions in large scale numerical simulations has been 
increasing steadily. For dynamical simulations a factor of two improvement in lattice spacing can easily translate 
to a gain of 100 in computational cost, which usually more than compensates for the reduction in efficiency of the 
algorithm. Systematic improvement programs remove lattice artifacts perturbatively [1, 2, 3] or non-perturbatively
[4, 5, 6, 7, 8, 9] by adding irrelevant operators to the action. The use of fat or smeared links is part of many of these
improvement programs [10]. Smearing the links of a lattice action does not change the long-distance properties of
the system but by smoothing out short scale lattice vacuum fluctuations, it reduces lattice artifacts. Fat link actions 
by themselves show improved scaling properties, especially in quantities most sensitive to short distance fluctuations. 

For Wilson-type clover fermions, chiral properties show significant improvement in fat link quenched simulations
[11]. Fattening removes many small dislocations, which reduces the spread of the real eigenmodes of the Dirac
operator and the occurrence of exceptional configurations. The perturbation theory for fat-link Wilson-type fermions
has been worked out in [12], showing that the additive mass renormalization is small, the renormalization factors are
very close to one and the tree-level clover coefficient $c_{SW} = 1.0$ is expected to be close to the non-perturbative value.

Fat links have also been successfully used in overlap actions [13, 14]. The improved chiral properties of fat link
actions result in significantly faster convergence in evaluating Neuberger's formula. 

The use of fat links with staggered fermions improves flavor symmetry. In the staggered fermion formulation the 
four components of a Dirac spinor occupy different lattice sites and connect to different gauge fields, leading to flavor 
symmetry breaking. The flavor symmetry breaking is especially evident for the pions: only one of the pseudoscalar 
mesons is a true Goldstone particle, the others are massive even at vanishing quark mass. Since flavor symmetry 
breaking is basically due to the fluctuations of the gauge fields within a hypercube and is particularly sensitive to 
dislocations, local smearing of the gauge links is very effective in reducing flavor symmetry breaking. Several quenched
simulations verified this conjecture [15, 16, 17]. Dynamical simulations with one level of smearing [18, 19] found similar
improvement. Perturbative studies of flavor symmetry breaking also support the use of fat links [20].

Dynamical simulation of fermion actions with fat links can be very complicated. Even in the simplest case where 
the fat link is constructed as a sum of several paths connecting the fermions, the fermionic force term will have many 
more terms than with a thin link action. If the fat link is projected back to SU(3) and the fattening procedure is iterated
(as proved to be most effective in quenched simulations) , direct calculation of the fermion force term becomes nearly 
impossible. 

In this article we present a new method for simulating fat link fermion actions with many levels of projected 
smearing. The basic idea is to introduce an auxiliary but dynamical gauge field for each smearing level. These gauge 
fields couple to each other by blocking kernels representing one level of smearing. The last of the auxiliary gauge fields
couples directly to the fermions just like ordinary thin links, thus avoiding the complicated gauge force computations.
Our construction does not consider the systematic improvement of the action, but it can be combined with any thin 
link fermionic action. Combining systematic improvements of a thin link action with fat link fermions can lead to 
further systematic improvement. 

To motivate our choice of fat link action in Sect. 2 we study flavor symmetry breaking with staggered fermions 
in the quenched approximation. We consider valence actions with different levels of smearing and we compare the 
results with the standard thin link action. The quenched simulation suggests that with three levels of projected fat 
links flavor symmetry improves to a level corresponding to a factor of 2.5 change in lattice spacing. 

In Sect. 3 we present our general construction of a fat link fermion action. We suggest combining different updating 
methods in the numerical simulation. The simplest is to update the original and all but the last auxiliary gauge fields 
using Metropolis updating. The last level auxiliary gauge field couples to the fermions. Any fermionic updating can
be used; we consider molecular dynamics updating here.

Unfortunately this coupled system is very rigid and evolves extremely slowly under local updating. In Sect. 4 we 
discuss a global over-relaxation step that improves the situation considerably. Over-relaxation updating based on the
gauge action is usually not very effective for fermions. The situation here is quite different. Since the fermions
couple to a smooth gauge field that has been smeared several times, we find that one can update a part of the lattice
of up to $(0.3\ \mathrm{fm})^4$ and still have a large enough acceptance rate to make the algorithm efficient.

In Sect. 5 we specify the action for four flavors of staggered fermions and discuss the performance of the
over-relaxation update in detail. One iteration of our algorithm consists of 100 over-relaxation steps, one Metropolis
step and one molecular dynamics step. This combined updating is about 15-20 times slower than a molecular dynamics
thin link updating and has about the same autocorrelation times. Considering that we gain well over a factor of 100
from the improved scaling properties, this cost is acceptable. Our first results with this algorithm confirm the
quenched results for flavor symmetry breaking. We find that flavor symmetry violations are reduced to a few percent
at a lattice spacing $a \approx 0.20$ fm.
FIG. 1: The mass of the lightest non-Goldstone pion as a function of the mass $m_G$ of the Goldstone pion in the quenched
approximation. Fig. 1a) shows results for $\beta = 5.7$ using both a thin link and different fat link valence actions. In Fig. 1b)
results obtained with our best fat link action on a $\beta = 5.7$ background are compared with thin link action results on
$\beta = 5.7$, $\beta = 6.0$ and $\beta = 6.2$ background configurations. The lattice spacing changes by a factor of 2.5
between $\beta = 5.7$ and $\beta = 6.2$.



In Sect. 6 we summarize our results and discuss future directions.



II. FAT LINK ACTIONS AND QUENCHED SPECTROSCOPY 

In this section we investigate the spectrum of fat link staggered actions in the quenched approximation. Previous 
extensive studies [17] have demonstrated the improvement in the restoration of flavor symmetry due to the smearing
of the gauge links. Our goal here is to motivate the parameters of our dynamical fat link action. 

We fatten the gauge links using APE smearing [22]: the smeared or fat link $\Omega$ is constructed from the thin link $U$ as

$$\Omega_{i,\mu} = (1-\alpha)\,U_{i,\mu} + \frac{\alpha}{6}\,\Sigma_{i,\mu}(U), \tag{1}$$

where $\Sigma_{i,\mu}(U)$ is the sum over the six staples around the link $U_{i,\mu}$. We use the index $i$ to label the
lattice sites and the index $\mu$ to label the four space-time directions. The smearing procedure eq. (1) can be iterated
if the fat link $\Omega$ is projected to SU(3)

$$W_{i,\mu} = \mathrm{Proj}_{SU(3)}\{\Omega_{i,\mu}\}. \tag{2}$$

The $n$-th level fat link is given by

$$\Omega^{(n)}_{i,\mu} = (1-\alpha)\,W^{(n-1)}_{i,\mu} + \frac{\alpha}{6}\,\Sigma_{i,\mu}(W^{(n-1)}), \tag{3}$$

where $\Sigma_{i,\mu}(W^{(n-1)})$ is the sum of staples around $W^{(n-1)}_{i,\mu}$, the $(n-1)$-th level fat link projected
onto SU(3) ($W^{(0)} = U$). In the following we label by $N$ the number of smearing iterations or levels. Perturbative
arguments [12] show that for values of the smearing parameter $0 < \alpha < 0.75$ the smearing orders the gauge
configuration, suppressing small scale fluctuations. If $\alpha > 0.75$ the smearing eventually disorders. In the
following we choose, somewhat arbitrarily, $\alpha = 0.70$.

We consider two different SU(3) projections: a deterministic projection $W_{\max,i,\mu}$ defined by

$$\mathrm{Re\,Tr}\big(W_{\max,i,\mu}\,\Omega^{\dagger}_{i,\mu}\big) = \max_{W \in SU(3)} \mathrm{Re\,Tr}\big(W\,\Omega^{\dagger}_{i,\mu}\big) \tag{4}$$

and a probabilistic projection $W_{i,\mu}$, where $W_{i,\mu}$ is chosen according to the probability distribution

$$P(W) \propto \exp\Big[\frac{\lambda}{3}\,\mathrm{Re\,Tr}\big(W\,\Omega^{\dagger}_{i,\mu}\big)\Big] \tag{5}$$

with projection parameter $\lambda$. For $\lambda = \infty$, eq. (5) is equivalent to eq. (4).

FIG. 2: The mass renormalization in the quenched approximation at $\beta = 5.7$ for the thin link and different fat link valence
actions. In addition to the actions used in Fig. 1, we show the results for $N = 1$ level of non-projected (np.) APE smearing.

To illustrate the effect of fat links on flavor symmetry restoration we calculated the pion spectrum on a set of $8^3 \times 24$,
$\beta = 5.7$ ($a = 0.17$ fm) quenched lattices. In Fig. 1a) we plot the masses of the (would-be) Goldstone pion $\pi_G$ and the
lightest non-Goldstone pion, $\pi_{i,5}$, corresponding to the representation $\gamma_5 \otimes \gamma_i\gamma_5$. We label the
representation of the states following [23, 24] by $\Gamma_S \otimes \Gamma_F$, where $\Gamma_S$ labels the spin and $\Gamma_F$
labels the SU(4) flavor. The pion masses are expressed in units of the Sommer scale $r_0$ ($r_0 = 2.87a$ at $\beta = 5.7$). In
addition to the thin link we consider three fat link valence actions: $N = 1$ and $N = 3$ levels of smearing with projection
parameter $\lambda = 500$ and $N = 3$ levels of smearing with projection parameter $\lambda = 100$, all with $\alpha = 0.7$. We
observe considerable improvement from $N = 1$ to $N = 3$. Also, $\lambda = 100$ for the projection parameter is clearly not as
effective as $\lambda = 500$. Increasing the number of blocking levels $N$ or the projection parameter $\lambda$ further does
not improve the situation substantially but increases the computational effort considerably. In the rest of this paper we
will consider the fat link action corresponding to $N = 3$ levels of blocking with projection parameter $\lambda = 500$ and
APE parameter $\alpha = 0.7$. While these parameters are not unique, they seem to be close to optimal.

To get a feel for the amount of improvement one can achieve with this fat link action, in Fig. 1b) we compare
the $N = 3$, $\lambda = 500$ action with thin link data at $\beta = 5.7$, 6.0 and 6.2. The data for the last two $\beta$ values are from
ref. and correspond to lattice spacings $a = 0.095$ fm and 0.069 fm and Sommer parameters $r_0 = 5.26a$ and $7.25a$,
respectively [26]. The $N = 3$, $\lambda = 500$ smearing reduces flavor symmetry violations on the $\beta = 5.7$ configurations to
the level obtained at $\beta = 6.2$ with the thin link action, a factor of 2.5 improvement in lattice spacing. The computational
cost of full QCD grows at least as $1/a^6$ and for certain quantities even as $1/a^{10}$ [27]. A factor of 2.5 in the lattice
spacing means a factor of 200 to $10^4$ in the computational costs.
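The quoted range follows directly from the two scaling exponents. A minimal sketch of the arithmetic, assuming the cost scales as a pure power of $1/a$ (the function name is introduced here for illustration only):

```python
def cost_gain(spacing_ratio: float, power: int) -> float:
    """Factor by which the cost drops if the lattice spacing can be coarser by spacing_ratio,
    assuming cost ~ 1/a**power."""
    return spacing_ratio ** power

gain_low = cost_gain(2.5, 6)     # cost ~ 1/a^6
gain_high = cost_gain(2.5, 10)   # cost ~ 1/a^10
print(f"{gain_low:.0f} to {gain_high:.0f}")
```

The lower exponent gives roughly 2.4 x 10^2 and the higher one roughly 10^4, which is the range stated above.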

Finally, in Fig. 2 we plot the square of the Goldstone-pion mass as a function of the bare quark mass. All
measurements are on the $\beta = 5.7$ gauge configurations using thin link and different fat link actions. In addition to the
actions considered in Fig. 1 we show the results for $N = 1$ level of non-projected APE smearing. Fat link perturbation
theory predicts that the mass renormalization constant becomes perturbative as $N$ increases. We see that the mass
renormalization after one level of non-projected smearing is almost the same as without smearing. Increasing the
smearing level and the value of $\lambda$ indeed reduces the multiplicative mass renormalization factor.



III. DYNAMICAL FERMIONS WITH FAT LINKS 



The quenched results show that smearing the gauge links considerably improves the flavor symmetry of staggered
fermions. There is evidence that these results carry over to dynamical simulations where one level of smearing
is implemented. Because of the gauge force computations in the molecular dynamics equations of motion, it is very
complicated to simulate fermions with many levels of smearing, and practically impossible if the gauge links are
projected onto SU(3) after each smearing step.

Our method to overcome this difficulty is to introduce a set of auxiliary but dynamical gauge fields which couple
to each other in the action by blocking kernels representing one level of smearing. The last level auxiliary gauge field
couples directly to the fermions in the same way the usual thin links do. The problem of computing the
gauge force is transferred to the gauge sector, where it can be solved. This construction can be used for any fermion
action which can be simulated with "thin" links.

Let us start by considering fermions coupled to fat links constructed with one level of smearing from the thin links
$U_{i,\mu}$. We introduce a dynamical auxiliary gauge field $V$ and we define the action

$$S = -\frac{\beta}{3}\sum_{p} \mathrm{Re\,Tr}(U_p) - \frac{\lambda}{3}\sum_{i,\mu} \mathrm{Re\,Tr}\big(V_{i,\mu}\,W^{\dagger}_{\max,i,\mu}(U)\big) - \mathrm{tr}\ln\big[M^{\dagger}(V)M(V)\big]. \tag{6}$$

Here $p$ labels the plaquettes $U_p$, $W_{\max}(U)$ is the SU(3) projection eq. (4) of the fat link given in eq. (1) and $M(V)$ is
the fermion matrix. The blocking parameter $\lambda$ constrains the auxiliary gauge links $V_{i,\mu}$ to be close to the projected
fat links $W_{\max,i,\mu}(U)$. Fluctuations of the field $V$ are proportional to $1/\lambda$. This is a dynamical realization of the
projection eq. (5). In eq. (6), Tr means the trace over SU(3) color whereas tr means the overall trace over space-time
indices $i$, directions $\mu$, spin and color.

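For many levels of smearing, we introduce a set of dynamical auxiliary gauge fields $W^{(1)}, W^{(2)}, \ldots, W^{(N)} = V$, one
for each level of smearing. The action eq. (6) is generalized to

$$S = -\frac{\beta}{3}\sum_{p} \mathrm{Re\,Tr}(U_p) - \frac{\lambda}{3}\sum_{i,\mu} \mathrm{Re\,Tr}\big(W^{(1)}_{i,\mu}\,W^{\dagger}_{\max,i,\mu}(U)\big) - \frac{\lambda}{3}\sum_{i,\mu} \mathrm{Re\,Tr}\big(W^{(2)}_{i,\mu}\,W^{\dagger}_{\max,i,\mu}(W^{(1)})\big) - \cdots - \frac{\lambda}{3}\sum_{i,\mu} \mathrm{Re\,Tr}\big(V_{i,\mu}\,W^{\dagger}_{\max,i,\mu}(W^{(N-1)})\big) - \mathrm{tr}\ln\big[M^{\dagger}(V)M(V)\big]. \tag{7}$$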

This action is in the same universality class as the original thin link action. If the blocking parameter is $\lambda = \infty$, the
auxiliary gauge fields can be integrated out and the resulting action is a plaquette gauge action while the fermions
couple to deterministically projected fat links. If $\lambda \neq \infty$, the integration of the auxiliary gauge fields will introduce
additional gauge terms. These new terms depend on the thin link variables in a complicated way, but they are all 
local terms containing only a finite number of link variables. Thus, these terms do not change the universality class 
of the action. 

The updating of this system can, in principle, be done by a sequence of Metropolis updatings for the fields
$U, W^{(1)}, \ldots, W^{(N-1)}$ and by the standard updating for the fermion matrix $M$, with the only difference that now
the last auxiliary gauge field $V$ enters the fermion matrix.

The problem with this basic algorithm is that the system with large $\lambda$ is very rigid and evolves very slowly. To cure
this problem, we use a hybrid over-relaxation algorithm [28, 29, 31, 32, 33, 34, 35, 36, 37] with

* Metropolis updating for the gauge fields $U, W^{(1)}, \ldots, W^{(N-1)}$,

* standard algorithm for the fermion matrix $M(V)$ and

* global over-relaxation for all the gauge fields $U, W^{(1)}, \ldots, V$.

In the next section we discuss how an over-relaxation can be implemented. It plays a key role in reducing the 
autocorrelation times of the Metropolis update. Note that the over-relaxation update is effective only because the 
fermions couple to the smooth fat links. 

IV. OVER-RELAXATION WITH FAT LINKS 

Our over-relaxation update is based on the usual over-relaxation reflection step used in pure gauge systems that
leaves the gauge action invariant [32, 33]. We reflect the thin links just like in the standard over-relaxation algorithm
by sweeping in a given order through some part (or all) of the lattice

$$U \to U'. \tag{8}$$
These changes are followed by a transformation of the fat links

$$W^{(1)\prime}_{i,\mu} = W^{(1)}_{i,\mu}\,W^{\dagger}_{\max,i,\mu}(U)\,W_{\max,i,\mu}(U'), \tag{9}$$
$$W^{(2)\prime}_{i,\mu} = W^{(2)}_{i,\mu}\,W^{\dagger}_{\max,i,\mu}(W^{(1)})\,W_{\max,i,\mu}(W^{(1)\prime}), \tag{10}$$
$$V'_{i,\mu} = V_{i,\mu}\,W^{\dagger}_{\max,i,\mu}(W^{(N-1)})\,W_{\max,i,\mu}(W^{(N-1)\prime}). \tag{11}$$

All links for a given level are reflected and then the next level gauge field is changed. The reflections must be performed
in the order given by eq. (8)-eq. (11). This transformation leaves the gauge part of the action invariant, but the
fermionic part will change and a Metropolis accept/reject step must be performed. In order for this updating to satisfy
detailed balance, the probability $P$ for changing the gauge field configuration $\{U, W^{(1)}, \ldots, V\}$ to
$\{U', W^{(1)\prime}, \ldots, V'\}$ has to be equal to the probability for the reversed change, i.e.

$$P(\{U, W^{(1)}, \ldots, V\} \to \{U', W^{(1)\prime}, \ldots, V'\}) = P(\{U', W^{(1)\prime}, \ldots, V'\} \to \{U, W^{(1)}, \ldots, V\}). \tag{12}$$

This can be achieved by choosing with probability 1/2 either a given sequence of the thin link reflections eq. (8) or
with equal probability the reversed sequence [21]. The sequence has to be reversed with respect to the direction and
location of the thin links and with respect to the index of the SU(2) subgroup. The thin link reflections are then
followed by the local reflections eq. (9)-eq. (11) for the auxiliary gauge links. For a given level of auxiliary gauge field
the reflections are independent of the order in which we sweep through the lattice and they are symmetric under
exchange of primed and unprimed quantities. If we start from a gauge field configuration $\{U, W^{(1)}, \ldots, V\}$ and apply
eq. (8)-eq. (11) twice with the sequence of thin link reflections reversed, we come back to the original configuration.
This over-relaxation algorithm satisfies detailed balance and is a valid update of the system. To achieve ergodicity,
though, one must still perform some updatings with the basic algorithm.
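The invariance of a blocking kernel under such a move, and the symmetry under exchange of primed and unprimed links, can be checked numerically for a single link. The following is a minimal sketch under stated assumptions: the move is taken to be of the form $V' = V\,W^\dagger_{\rm old}\,W_{\rm new}$, random unitary matrices stand in for the projected fat links (only unitarity is needed for the check), and all function names are hypothetical.

```python
import numpy as np

def random_unitary(rng, n=3):
    """Random n x n unitary matrix from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

def kernel(v, w):
    """Single-link blocking-kernel term Re Tr(V W^dagger)."""
    return np.trace(v @ w.conj().T).real

def reflect(v, w_old, w_new):
    """Assumed form of the fat link move: V' = V W_old^dagger W_new."""
    return v @ w_old.conj().T @ w_new

rng = np.random.default_rng(0)
v, w_old, w_new = (random_unitary(rng) for _ in range(3))

v_new = reflect(v, w_old, w_new)
kernel_before = kernel(v, w_old)      # Re Tr(V W^dagger) before the move
kernel_after = kernel(v_new, w_new)   # Re Tr(V' W'^dagger) after the move

# Exchanging the roles of old and new links undoes the move.
v_back = reflect(v_new, w_new, w_old)
print(kernel_before, kernel_after)
```

The two kernel values agree to rounding error because $W_{\rm new} W^\dagger_{\rm new} = 1$, and applying the move with the links exchanged returns the original $V$.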



For the Metropolis accept-reject step following the changes eq. (8)-eq. (11) we have to calculate the action, i.e. we
need an explicit form of the determinant to evaluate. For four flavors of staggered and two flavors of Wilson fermions
this is trivially realized. In terms of pseudofermion fields $\Phi$ the action eq. (7) is

$$S = S_g(U, W^{(1)}, \ldots, V) + \Phi^{\dagger}\big[M^{\dagger}(V)M(V)\big]^{-1}\Phi, \tag{13}$$

where $M(V)$ represents the fermion matrix. The equilibrium probability distribution of the gauge fields
$U, W^{(1)}, \ldots, V$ is given by

$$P_{\rm eq}(U, W^{(1)}, \ldots, V) \propto e^{-S_g(U, W^{(1)}, \ldots, V)}\,\det\big(M^{\dagger}(V)M(V)\big). \tag{14}$$

Since the pure gauge part of the action is invariant under the over-relaxation move of the system, the acceptance
probability is

$$P_{\rm acc}(V', V) = \min\left\{1, \frac{\det\big(M^{\dagger}(V')M(V')\big)}{\det\big(M^{\dagger}(V)M(V)\big)}\right\}. \tag{15}$$

$P_{\rm acc}$ depends only on the last level auxiliary gauge links. The goal is to efficiently compute the acceptance probability.



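Instead of calculating the determinant in eq. (14) we use a stochastic estimator to approximate $P_{\rm acc}$ [38],

$$P'_{\rm acc}(V', V) = \min\left\{1, \exp\Big(\xi^{\dagger}\big[M^{\dagger}(V')M(V') - M^{\dagger}(V)M(V)\big]\xi\Big)\right\}, \tag{16}$$

where the vector $\xi$ is generated according to the distribution

$$P(\xi) \propto \exp\Big(-\xi^{\dagger} M^{\dagger}(V')M(V')\,\xi\Big). \tag{17}$$

After configuration averaging, this procedure satisfies the detailed balance condition [39].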
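The statement that the noisy exponential reproduces the determinant ratio on average can be illustrated on small dense matrices. This is only a toy check under stated assumptions: the "fermion matrix" is a hypothetical 4 x 4 matrix of the form mass plus an anti-Hermitian part, the old and new matrices differ by a small perturbation, and the average is taken without the min{1, ...} cut so that it can be compared with the exact ratio.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4          # toy matrix size (stands in for the lattice degrees of freedom)
mass = 1.0     # toy mass term

def random_hermitian(scale):
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return scale * (z + z.conj().T) / 2

def toy_fermion_matrix(h):
    """M = mass + i*H with H Hermitian, so the kinetic part i*H is anti-Hermitian."""
    return mass * np.eye(n) + 1j * h

h_old = random_hermitian(0.3)
h_new = h_old + random_hermitian(0.03)   # small change of the "gauge field"
m_old, m_new = toy_fermion_matrix(h_old), toy_fermion_matrix(h_new)
a_old = m_old.conj().T @ m_old           # M^dagger(V) M(V)
a_new = m_new.conj().T @ m_new           # M^dagger(V') M(V')

exact_ratio = (np.linalg.det(a_new) / np.linalg.det(a_old)).real

# Draw R with P(R) ~ exp(-R^dagger R); then xi = M(V')^{-1} R is distributed
# as exp(-xi^dagger M^dagger(V') M(V') xi).
n_samples = 200_000
r = (rng.normal(size=(n_samples, n)) + 1j * rng.normal(size=(n_samples, n))) / np.sqrt(2)
xi = np.linalg.solve(m_new, r.T).T
exponent = np.einsum("si,ij,sj->s", xi.conj(), a_new - a_old, xi).real
estimated_ratio = np.exp(exponent).mean()
print(exact_ratio, estimated_ratio)
```

For matrices that differ only slightly the sample average is close to the exact ratio; when the two matrices differ strongly the fluctuations of the exponential grow quickly, which is the problem addressed in the text that follows.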

If the gauge fields $V'$ and $V$ are very different, the acceptance rate from eq. (16) will be small even when the
fermionic determinants are actually very close. This is because of the large fluctuations of the stochastic estimator.
To improve the acceptance rate, we attempt to remove the most ultraviolet part of the fermionic matrix and include
it explicitly as an effective gauge action. The resulting reduced matrix gives a much higher acceptance in eq. (16).

For heavy fermions the fermion determinant gives rise to an effective plaquette term for the gauge field [40]. Even
for small quark masses the ultraviolet part of the fermion determinant can be well approximated by an effective loop
action involving only small Wilson loops [41, 42, 43]. These observations suggest removing the plaquette term from
the fermion matrix by introducing a reduced matrix $M_r$ as

$$M(V) = M_r(V)\,A(V) \quad \text{with} \tag{18}$$
$$A(V) = \exp\big[\alpha_4 D^4(V) + \alpha_2 D^2(V)\big], \tag{19}$$

where $D$ is the kinetic part of the fermion matrix. In terms of $M_r$ the fermion determinant becomes

$$\det\big(M^{\dagger}(V)M(V)\big) = \det\big(M_r^{\dagger}(V)M_r(V)\big)\,e^{-S_{\rm eff}(V)}, \tag{20}$$
$$S_{\rm eff}(V) = -2\alpha_4\,\mathrm{Re\,tr}\big[D^4(V)\big] - 2\alpha_2\,\mathrm{Re\,tr}\big[D^2(V)\big]. \tag{21}$$

The effective action $S_{\rm eff}(V)$ can be evaluated explicitly. In general it is the sum of a plaquette term coming from
$\mathrm{tr}[D^4(V)]$ and a constant from $\mathrm{tr}[D^2(V)]$. The real parameters $\alpha_2$ and $\alpha_4$ are free and can be optimized. In a
different context, a decomposition of the fermion matrix like eq. (18) has been proposed in [44], motivated by the
hopping parameter expansion.
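The factorization of the determinant can be verified on a small dense example. The sketch below assumes $A = \exp[\alpha_4 D^4 + \alpha_2 D^2]$ with an anti-Hermitian toy operator $D = iH$ and a fermion matrix of the form $M = 2m + D$; the matrix size, mass and parameter values are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
n, mass = 6, 0.1
alpha2, alpha4 = -0.18, -0.006            # illustrative parameter values

z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
h = 0.5 * (z + z.conj().T) / 2            # Hermitian H
d = 1j * h                                # anti-Hermitian toy kinetic operator D
m = 2 * mass * np.eye(n) + d              # toy fermion matrix M = 2m + D

# D^2 = -H^2 and D^4 = H^4, so A = exp(alpha4 D^4 + alpha2 D^2) is built from the spectrum of H.
evals, q = np.linalg.eigh(h)
d2_evals, d4_evals = -evals**2, evals**4
a = (q * np.exp(alpha4 * d4_evals + alpha2 * d2_evals)) @ q.conj().T
m_r = m @ np.linalg.inv(a)                # reduced matrix, M = M_r A

# S_eff = -2 alpha4 Re tr D^4 - 2 alpha2 Re tr D^2
s_eff = -2 * alpha4 * d4_evals.sum() - 2 * alpha2 * d2_evals.sum()

lhs = np.linalg.det(m.conj().T @ m).real
rhs = np.linalg.det(m_r.conj().T @ m_r).real * np.exp(-s_eff)
print(lhs, rhs)
```

Both sides agree to rounding error, since $\det(A^\dagger A) = \exp(2\,\mathrm{Re\,tr}[\alpha_4 D^4 + \alpha_2 D^2]) = e^{-S_{\rm eff}}$.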

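In terms of $S_{\rm eff}(V)$ and $M_r(V)$ the acceptance probability is

$$P_{\rm acc}(V', V) = \min\left\{1, \frac{e^{-S_{\rm eff}(V')}\,\det\big(M_r^{\dagger}(V')M_r(V')\big)}{e^{-S_{\rm eff}(V)}\,\det\big(M_r^{\dagger}(V)M_r(V)\big)}\right\}. \tag{22}$$

It can be approximated by generating a vector $\xi$ according to the distribution

$$P(\xi) \propto \exp\Big(-\xi^{\dagger} M_r^{\dagger}(V')M_r(V')\,\xi\Big) \tag{23}$$

and eq. (22) becomes

$$P'_{\rm acc}(V', V) = \min\left\{1, \exp\Big(S_{\rm eff}(V) - S_{\rm eff}(V') + \xi^{\dagger}\big[M_r^{\dagger}(V')M_r(V') - M_r^{\dagger}(V)M_r(V)\big]\xi\Big)\right\}. \tag{24}$$

In practice, we start by generating a random Gaussian source $R$ according to the probability distribution

$$P(R) \propto e^{-R^{\dagger}R} \tag{25}$$

from which we form the vectors

$$\Phi' = M^{\dagger}(V')R, \tag{26}$$
$$X' = \big[M^{\dagger}(V')M(V')\big]^{-1}\Phi'. \tag{27}$$

The vector $\xi$ in eq. (23) is then given by $\xi = A(V')X'$ and we can write the fermionic terms in eq. (24) as

$$\xi^{\dagger} M_r^{\dagger}(V')M_r(V')\,\xi = \Phi'^{\dagger}X', \tag{28}$$
$$\xi^{\dagger} M_r^{\dagger}(V)M_r(V)\,\xi = X'^{\dagger}A^{\dagger}(V')\,A^{\dagger}(V)^{-1}M^{\dagger}(V)M(V)\,A(V)^{-1}A(V')\,X'. \tag{29}$$

The acceptance probability strongly depends on the parameters $\alpha_2$ and $\alpha_4$. The optimal value can be found numerically.
We will discuss our choice in the next section.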

This procedure is not effective if the fermion matrix depends on thin links, because the fluctuations in the stochastic 
estimator are too large even after the removal of the $D^4$ and $D^2$ terms. When the links in the fermion matrix are
smeared, these fluctuations are constrained. This is the key feature which makes the over-relaxation effective with fat 
links. 

V. PERFORMANCE OF THE ALGORITHM 

For testing our fat link action we decided to simulate $N_f = 4$ flavors of staggered fermions. The fermion matrix is
given by

$$M(V)_{i,j} = 2m\,\delta_{i,j} + \sum_{\mu} \eta_{i,\mu}\big(V_{i,\mu}\,\delta_{i,j-\hat\mu} - V^{\dagger}_{i-\hat\mu,\mu}\,\delta_{i,j+\hat\mu}\big), \tag{30}$$

where $\eta_{i,\mu}$ are the staggered phases. We impose anti-periodic boundary conditions in the time direction. The
pseudofermion field $\Phi$ and the matrix $M^{\dagger}(V)M(V)$ are restricted to the even sites of the lattice. The matrix $D$
used in eq. (19) is given by

$$D_{i,j} = \sum_{\mu} \eta_{i,\mu}\big(V_{i,\mu}\,\delta_{i,j-\hat\mu} - V^{\dagger}_{i-\hat\mu,\mu}\,\delta_{i,j+\hat\mu}\big). \tag{31}$$

FIG. 3: The effective physical volume where the $U$ links change in one global over-relaxation (GOR) updating step as a function
of the physical volume "touched". The results are obtained on $8^3 \times 24$ lattices.

The traces in $S_{\rm eff}(V)$ in eq. (21) are computed by summing over the even sites only and give

$$\mathrm{tr}\big[D^4(V)\big] = 24\,\Omega\Big[3 - \frac{1}{6\Omega}\sum_{p} \mathrm{Re\,Tr}(V_p)\Big] + 108\,\Omega - 4\,\delta_{N_t,4}\sum_{i,t=0} \mathrm{Re\,Tr}(P_i), \tag{32}$$
$$\mathrm{tr}\big[D^2(V)\big] = -12\,\Omega. \tag{33}$$

$\Omega$ denotes the total number of lattice points and we assume that there are more than 4 sites in each space-like direction.
For $N_t = 4$ sites in the time-like direction there is an extra contribution to $\mathrm{tr}[D^4(V)]$ coming from the Polyakov lines
$P_i$ starting at location $i, t = 0$. The minus sign of the Polyakov line in eq. (32) is due to the anti-periodic boundary
conditions in time direction.
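The constant contributed by $\mathrm{tr}[D^2]$ can be checked by brute force on a small lattice: a direct count of the two-step closed paths gives $-24$ per even site (eight hops, three colors), i.e. $-12\,\Omega$ when summed over the $\Omega/2$ even sites. The sketch below assumes the standard staggered phases $\eta_{i,\mu} = (-1)^{x_0 + \dots + x_{\mu-1}}$, uses random unitary matrices as links and a $4^4$ toy lattice; on such a small lattice only $\mathrm{tr}[D^2]$ is tested, because $\mathrm{tr}[D^4]$ would pick up additional loops winding around the lattice.

```python
import itertools
import numpy as np

L = 4                          # sites per direction (toy size)
dims = (L, L, L, L)
volume = L ** 4                # Omega, the total number of lattice sites
rng = np.random.default_rng(3)

def random_unitary(n=3):
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

def site_index(x):
    """Lexicographic site index with periodic wrap-around."""
    return int(np.ravel_multi_index(tuple(c % L for c in x), dims))

sites = list(itertools.product(range(L), repeat=4))
links = {(site_index(x), mu): random_unitary() for x in sites for mu in range(4)}

d = np.zeros((3 * volume, 3 * volume), dtype=complex)
even_mask = np.zeros(volume, dtype=bool)
for x in sites:
    i = site_index(x)
    even_mask[i] = sum(x) % 2 == 0
    for mu in range(4):
        eta = (-1) ** sum(x[:mu])                      # staggered phase eta_{i,mu}
        fwd, bwd = list(x), list(x)
        fwd[mu] += 1
        bwd[mu] -= 1
        j_f, j_b = site_index(fwd), site_index(bwd)
        # anti-periodic boundary condition in the time direction (mu = 3)
        s_f = -1 if (mu == 3 and x[mu] == L - 1) else 1
        s_b = -1 if (mu == 3 and x[mu] == 0) else 1
        d[3*i:3*i+3, 3*j_f:3*j_f+3] += s_f * eta * links[(i, mu)]
        d[3*i:3*i+3, 3*j_b:3*j_b+3] -= s_b * eta * links[(j_b, mu)].conj().T

d2_diag = np.einsum("ij,ji->i", d, d)                  # diagonal entries of D^2
trace_even = d2_diag.reshape(volume, 3)[even_mask].sum()
print(trace_even, -12 * volume)
```

The result is independent of the link values because each forward hop is immediately undone by the matching backward hop, whose link is the Hermitian conjugate.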

As we described in Sect. 2 we choose the number of auxiliary gauge fields and the smearing parameters to be

$$N = 3, \quad \alpha = 0.7, \quad \lambda = 500. \tag{34}$$



The global over-relaxation (GOR) described in Sect. 4 is essential here as it considerably reduces the autocorrelations
of the otherwise very rigid system. The GOR leaves the pure gauge action of our system invariant but is subjected
to an accept/reject step which accounts for the ratio of fermion determinants, see eq. (22). Since the fermions couple
directly to the last level dynamical fat links, the acceptance rate $\alpha_{\rm GOR}$ is large enough to make the algorithm effective.

The parameters $\alpha_2$ and $\alpha_4$ entering eq. (19) are chosen to maximize the acceptance rate $\alpha_{\rm GOR}$. We use

$$\alpha_2 = -0.18, \quad \alpha_4 = -0.006. \tag{35}$$

Setting $\alpha_2 = \alpha_4 = 0$, i.e. computing the ratio of fermion determinants without decomposing the fermion matrix
according to eq. (18) and eq. (19), gives a value for $\alpha_{\rm GOR}$ which is a factor 10 smaller than what we achieve with our
choice. Moreover, keeping $\alpha_4 = 0$ and varying only $\alpha_2$ gives significantly lower $\alpha_{\rm GOR}$. The choice of eq. (35) is not
unique: we identified in the $\alpha_2$-$\alpha_4$ parameter space a band-like region in which $\alpha_{\rm GOR}$ reaches its maximal value.

links   100 x GOR   1 x MET   1 x HMC   total   CG iterations
thin        -           -        1        1         133
fat       11.0         3.5      0.5     15.0         61

TABLE I: Timings for simulation of ordinary thin link HMC algorithm compared to simulation of our algorithm with $N = 3$
auxiliary gauge fields. The dynamical lattices are $8^3 \times 24$ and the lattice spacings and physical quark masses of the two
simulations are approximately matched. The time unit is one updating step (consisting of one HMC trajectory) of the ordinary
thin link algorithm. The last two columns give the total time costs and the average number of conjugate gradient (CG)
iterations needed per inversion of $M^{\dagger}M$.

FIG. 4: Flavor symmetry violation for the dynamical runs with our fat link action (fat. dyn.) and with the standard thin
link staggered action (std. dyn.). The mass $m_\pi$ of the lightest non-Goldstone pion is plotted as a function of the mass $m_G$ of
the Goldstone pion (in units of $r_0$). The lattice spacings and the correlation lengths $r_0 m_G$ are approximately matched. For
comparison we plot the quenched results from Fig. 1 obtained with the corresponding valence actions (fat. quen. and std.
quen.).

Even with our improved GOR it is not possible to change all the $U$ links simultaneously, since the effectiveness of the
algorithm would be very low. Instead we choose a random block of $U$ links containing $(N_{\rm GOR}/4)$ sites and reflect only
the links within this block. These changes propagate more and more through the lattice as we consider the cascade
of reflections eq. (9)-eq. (11), as 19 fat links have to be changed by changing one link. In Fig. 3 we plot the volume
of the lattice in which the $U$ links are effectively updated (actually the fourth root of it), i.e. $(\alpha_{\rm GOR} N_{\rm GOR}/4)^{1/4} a$,
as a function of the physical volume of the "touched" $U$ links, $(N_{\rm GOR}/4)^{1/4} a$. These results are obtained on $8^3 \times 24$,
$\beta = 5.2$ and $am = 0.1$ lattices. The lattice spacing from the string tension can be estimated as $a \approx 0.2$ fm and the
correlation length as $r_0 m_G \approx 1.7$. We see in Fig. 3 that there is a maximal physical volume

$$V_{\rm GOR} \approx (0.3\ \mathrm{fm})^4 \tag{36}$$

which can be updated with a reasonable acceptance rate. The actual value of $\alpha_{\rm GOR}$ depends on the number of "touched"
links $N_{\rm GOR}$. The broad maximum in Fig. 3 corresponds to $\alpha_{\rm GOR} = 35\%$ ($N_{\rm GOR} = 64$), $\alpha_{\rm GOR} = 26\%$ ($N_{\rm GOR} = 96$)
and $\alpha_{\rm GOR} = 16\%$ ($N_{\rm GOR} = 144$) on the above lattices.
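The three quoted acceptance rates can be converted into the effective updated length with the expression used for the figure. This is a short arithmetic sketch, assuming $a = 0.2$ fm as estimated above; the function name is introduced here for illustration.

```python
def effective_length_fm(acceptance, n_touched, spacing_fm=0.2):
    """Fourth root of the effectively updated volume: (acceptance * N_GOR / 4)^(1/4) * a."""
    return (acceptance * n_touched / 4) ** 0.25 * spacing_fm

quoted = [(0.35, 64), (0.26, 96), (0.16, 144)]   # (acceptance rate, number of touched links)
lengths = [effective_length_fm(acc, n) for acc, n in quoted]
print([round(x, 3) for x in lengths])
```

All three combinations give an effective length close to 0.3 fm, which is why the maximum is broad: touching more links is compensated by the lower acceptance rate.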

We observed that the value eq. (36) scales with the lattice spacing. On lattices with smaller lattice spacing more
links can be updated at the same time, i.e. the physical volume of the updated region remains fixed. On larger physical
volumes, on the other hand, one would have to increase the number of GOR steps to achieve the same efficiency. In
our finite-temperature runs on $8^3 \times 4$ lattices [45] we observed that the effectiveness of the algorithm increases in
the deconfined phase. Somewhat surprisingly, the effectiveness also increases with decreasing quark mass. This fact
together with the faster convergence of the conjugate gradient (see Table I) allows simulations at quark masses that
are not practically possible with the standard thin link action.

In simulating the $8^3 \times 24 \approx 20\ \mathrm{fm}^4$ lattices, we found it effective to follow each Metropolis and HMC step by 100 GOR
updating steps, each reflecting about a $(0.3\ \mathrm{fm})^4$ section of the thin link lattice. For timing our algorithm we consider
the computer time necessary for 100 GOR steps, one Metropolis (MET) step for each of the gauge fields $U$, $W^{(1)}$, $W^{(2)}$, and
one HMC trajectory for the $V$ field with step size $\Delta t = 0.015$ and $N_{\rm traj} = 30$ steps. We compare this to the time of
one HMC trajectory with step size $\Delta t = 0.015$ and $N_{\rm traj} = 30$ steps for the thin link action [16] with parameters
$\beta = 5.25$ and $am = 0.06$. With these parameters the lattice spacings and the physical quark masses of the two actions
are approximately matched. Comparing the times for one updating iteration is fair because the autocorrelations for
simple observables like the plaquette or the chiral condensate $\bar\psi\psi$ are observed to be the same for the two algorithms
in these time units.

In Table I we show the results of the timing comparison. We use one iteration of ordinary thin link HMC as time
unit. One iteration of our algorithm costs a factor 15 more but, as pointed out in Sect. 2, we can effectively gain a
factor $10^2$ to $10^4$ in computer time due to improved scaling. With fat links, there is also a considerable reduction (a
factor 2) in the number of conjugate gradient (CG) iterations needed for the inversion of $M^{\dagger}M$, as shown in the last
column of Table I.

Finally we consider flavor symmetry restoration on the dynamical fat link lattices. Fig. 4 shows the first results
obtained with our fat link action on $8^3 \times 24$ lattices with parameters $\beta = 5.2$, $am = 0.1$ (square). The lattice spacing is
$a \approx 0.2$ fm and the Goldstone pion to rho mass ratio is $m_G/m_\rho = 0.71(1)$, indicating a correlation length $r_0 m_G \approx 1.7$.
This point agrees very well with the quenched results (octagons) taken from Fig. 1, i.e. flavor symmetry violation
is reduced to a few percent. To show the improvement due to the smearing of the gauge links, we also ran the
standard thin link staggered action with two sets of parameters, $\beta = 5.25$, $am = 0.06$ (fancy square) and $\beta = 5.2$,
$am = 0.06$ (burst). The lattice spacings from the string tension are approximately $a \approx 0.18$ fm and $a \approx 0.22$ fm, and
the correlation lengths are $r_0 m_G \approx 1.8$ and $r_0 m_G \approx 1.5$, respectively. The flavor symmetry violations on the thin link
dynamical lattices agree with the quenched predictions (fancy diamonds). Our fat link dynamical action has only
about 6% flavor symmetry violation compared to 60% for thin link actions at comparable lattice spacings.



VI. CONCLUSIONS 



We presented a new method for simulating dynamical fermions with fat links constructed through many levels of 
projected smearing. For each level of smearing we introduce an auxiliary but dynamical gauge field and these gauge 
fields are connected to each other by blocking kernels representing one level of smearing. Since the last of the auxiliary
fields couples in the standard way to the fermions, our construction can be used with any known fermionic update. We
discussed the simulation of our system which includes an over-relaxation updating. The fat links entering the fermion 
matrix make the over-relaxation effective and this is the key feature of our algorithm. 

At this time our algorithm is running on scalar machines and the over-relaxation step is worked out for four flavors 
of staggered fermions. The results for flavor symmetry restoration confirm the quenched results, that is, a factor 2.5 
in the lattice spacing can be gained. Taking into account that our algorithm is about 15-20 times slower than the 
standard one, this gives an overall gain of at least a factor 10 in computational costs. 

We have used this algorithm to study the finite temperature phase transition of four flavors of staggered fermions
at $N_t = 4$ [45]. We observe a qualitative difference compared to thin link simulations. The strongly first order phase
transition is washed away; we observe a very broad crossover instead. We believe this difference is due to the improved
flavor symmetry. In our simulations we have 15 relatively light pions compared to the single Goldstone particle of
thin link simulations.

We are parallelizing the code for large scale simulations [46]. This requires a 32-checkerboard structure but it is
not more complicated than parallelizing a Symanzik improved gauge action. The over-relaxation generalizes easily to
two flavors of Wilson fermions. To generalize it to two flavors of staggered fermions we need an explicit realization
of the action. That can be done by approximating the square root of the fermionic determinant with a polynomial
form [47]. Our preliminary study shows that it can be done efficiently.